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The theory of the strong force, quantum chromodynamics, describes the proton in 
terms of quarks and gluons. The proton is a state of two up quarks and one down 
quark bound by gluons, but quantum theory predicts that in addition thereis an 
infinite number of quark—antiquark pairs. Both light and heavy quarks, whose mass 
is respectively smaller or bigger than the mass of the proton, are revealed inside the 
proton in high-energy collisions. However, it is unclear whether heavy quarks also 
exist as a part of the proton wavefunction, which is determined by non-perturbative 
dynamics and accordingly unknown: so-called intrinsic heavy quarks’. It has been 
argued for along time that the proton could have a sizable intrinsic component of 
the lightest heavy quark, the charm quark. Innumerable efforts to establish intrinsic 
charm in the proton’ have remained inconclusive. Here we provide evidence for 
intrinsic charm by exploiting a high-precision determination of the quark—-gluon 
content of the nucleon® based on machine learning and a large experimental dataset. 
We disentangle the intrinsic charm component from charm-anticharm pairs arising 
from high-energy radiation‘. We establish the existence of intrinsic charm at the 
3-standard-deviation level, with amomentum distribution in remarkable agreement 
with model predictions’*.We confirm these findings by comparing them to very recent 
data on Z-boson production with charm jets from the Large Hadron Collider beauty 
(LHCb) experiment’. 


The foundational deep-inelastic scattering experiments at the SLAC lin- 
ear collider inthe late 1960s and early 1970s demonstrated the presence 
inside the proton of point-like constituents, soon identified with quarks, 
the elementary particles that interact and are bound inside the proton by 
gluons, the carriers of the strong nuclear force. It was rapidly clear, and 
confirmed in detail by subsequent studies, that these point-like constitu- 
ents, collectively called partons by Feynman’, include the up and down 
quarks that carry the proton quantum numbers, but also gluons, as well as 
an infinite number of pairs of quarks and their antimatter counterparts, 
antiquarks. The description of electron-proton and proton-proton 
collisions at high momentum transfers in terms of collisions between 
partonsis now rooted inthetheory of quantum chromodynamics (QCD), 
and it provides the basis of modern-day precision phenomenology at 
proton accelerators such as the Large Hadron Collider (LHC) of CERN® 
as well as for future facilities including the Electron-Ion Collider’, the 
Forward Physics Facility’° and neutrino telescopes”. 

Knowledge of the structure of the proton, which is necessary to 
obtain quantitative prediction for physics processes at the LHC and 
other experiments, is encoded in the distribution of momentum carried 
by partons of each type (gluons, up quarks, down quarks, up antiquarks 
and so on): parton distribution functions (PDFs). These PDFs could, in 
principle, be computed from first principles, but in practice even their 
determination from numerical simulations” is extremely challenging. 
Consequently, the only strategy available at present for obtaining the 
reliable determination of the proton PDFs that is required to evaluate 
LHC predictions is empirical, through the global analysis of data for 


which precise theoretical predictions and experimental measurements 
are available, so that the PDFs are the only unknown’, 

Although this successful framework has by now been worked through 
in great detail, several key open questions remain open. One of the 
most controversial of these concerns the treatment of so-called heavy 
quarks (that is, those whose mass is greater than that of the proton; 
m, = 0.94 GeV). Indeed, virtual quantum effects and energy-mass con- 
siderations suggest that the three light quarks and antiquarks (up, down 
and strange) should all be present in the proton wavefunction. Their 
PDFs are therefore surely determined by the low-energy dynamics that 
controls the nature of the proton as a bound state. However, it is well 
known®!** that in high enough energy collisions all species of quarks 
can be excited and hence observed inside the proton, so their PDFs are 
nonzero. This excitation follows from standard QCD radiation, and it 
can be computed accurately in perturbation theory. 

However, then the question arises of whether heavy quarks also 
contribute to the proton wavefunction. Such a contribution is called 
intrinsic, to distinguish it from that computable in perturbation theory, 
which originates from QCD radiation. Already since the dawn of QCD, 
it was argued that all kinds of intrinsic heavy quark must be present 
in the proton wavefunction”. In particular, it was suggested’ that the 
intrinsic component could be non-negligible for the charm quark, 
whose mass (m, ~ 1.51 GeV) is of the same order of magnitude as the 
mass of the proton. 

This question has remained highly controversial, and indeed recent 
dedicated studies have resulted in disparate claims, from excluding 
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Fig. 1| The intrinsic charm PDF and comparison with models. Left, the 
purely intrinsic (3FNS) result (blue) with PDFU alone, compared to the 4FNS 
PDF, which includes both an intrinsic and a radiative component, at 
Q=m,=1.51GeV (orange). The purely intrinsic (3FNS) result obtained using 
N7LO matching is also shown (green). Right, the purely intrinsic (FNS) final 


momentum fractions carried by intrinsic charm larger than 0.5% at the 
4-standard-deviation (40) level” to allowing up toa 2% charm momentum 
fraction’®. A particularly delicate issue in this context is that of separating 
the radiative component: finding that the charm PDF is nonzero at alow 
scale is not sufficient to argue that intrinsic charm has been identified. 

Here we present a resolution of this four-decade-long conundrum 
by providing unambiguous evidence for intrinsic charm in the proton. 
This is achieved by means of a determination of the charm PDF (ref. *) 
from an extensive hard-scattering global dataset, using state-of-the-art 
perturbative QCD calculations”, adapted to accommodate the pos- 
sibility of massive quarks inside the proton*?°”, and sophisticated 
machine learning techniques*”””*. This determination is performed 
at next-to-next-to-leading order (NNLO) in an expansion in powers 
of the strong coupling, a,, which represents the precision frontier for 
collider phenomenology. 

The charm PDF determined in this manner includes a radiative com- 
ponent, and indeed it depends on the resolution scale: it is givenina 
four-flavour number scheme (4FNS), in which up, down, strange and 
charm quarks are subject to perturbative radiative corrections and 
mix with each other and the gluon as the resolution is increased. The 
intrinsic charm component can be disentangled from it as follows. 
First, we note that in the absence of an intrinsic component, the initial 
condition for the charm PDF is determined using perturbative match- 
ing conditions”, computed up to NNLO in ref. *°, and recently (partly) 
extended up to next-to-next-to-next-to-leading order (N°LO; refs. 7°**). 
These matching conditions determine the charm PDF in terms of the 
PDFs of the 3FNS, in which only the three lightest quark flavours are 
radiatively corrected. Hence, this perturbative charm PDF is entirely 
determined in terms of the three light quarks and antiquarks and the 
gluon. However, the 3FNS charm quark PDF needs not vanish: in fact, 
if the charm quark PDF in the 4FNS is freely parametrized and thus 
determined from the data‘, the matching conditions can be inverted. 
The 3FNS charm PDF thus obtained is then by definition the intrinsic 
charm PDF: indeed, inthe absence of intrinsic charm it would vanish”. 
Thus, unlike the 4FNS charm PDF, which includes both an intrinsic 
and a radiative component, the 3FNS charm PDF is purely intrinsic. In 
this work we have performed this inversion at NNLO (ref. 7°) as well as 
at N°7LO (refs. 7°**), which—as we shall see—provides a handle on the 
perturbative uncertainty of the NNLO result. 

Our starting point is the NNPDF4.0 global analysis’, which provides 
a determination of the sum of the charm and anticharm PDFs, 
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result with total uncertainty (PDFU + MHOU), with the PDFU indicated as a dark 
shaded band; the predictions from the original BHPS model! and from the more 
recent meson/baryon cloud model are also shown for comparison (dotted and 
dot-dashed curves, respectively). 


namely c*(x, Q) =c(x, Q) + C(x, Q), inthe 4FNS. This can be viewed asa 
probability density inx, the fraction of the proton momentum carried 
by charm, in the sense that the integral over all values of O<x<1lof 
xc*(x) is equal to the fraction of the proton momentum carried by charm 
quarks, although note that PDFs are generally not necessarily positive 
definite. Our result for the 4FNS xc’ (x, Q) at the charm mass scale, Q =m, 
with m, = 1.51 GeV, is shown in Fig. 1 (left). The ensuing intrinsic charm 
is determined from it by transforming to the 3FNS using NNLO match- 
ing. This result is also shown in Fig. 1 (left). The bands indicate the 68% 
confidence level interval associated with the PDF uncertainties (PDFU) 
in each case. Henceforth, we will refer to the 3FNS xc’(x, Q) PDF as the 
intrinsic charm PDF. 

The intrinsic (3FNS) charm PDF exhibits a characteristic valence-like 
structure at large x peaking atx ~ 0.4. Although intrinsic charm is found 
to be small in absolute terms (it contributes less than 1% to the proton 
total momentum), it is significantly different from zero. Note that the 
transformation to the 3FNS has little effect on the peak region, because 
there is almost no charm radiatively generated at such large values of 
x: in fact, a very similar valence-like peak is already found in the 4FNS 
calculation. 

As atthe charm mass scale the strong coupling a, is rather large, the 
perturbative expansion converges slowly. To estimate the effect of 
missing higher-order uncertainties (MHOU), we have also performed 
the transformation from the 4FNS NNLO charm PDF determined from 
the data to the 3FNS (intrinsic) charm PDF at one order higher, namely 
at N°LO. The result is also shown Fig. 1 (left). Reassuringly, the intrinsic 
valence-like structure is unchanged. On the other hand, it is clear that 
forx < 0.2 perturbative uncertainties become very large. We can esti- 
mate the total uncertainty on our determination of intrinsic charm by 
adding in quadrature the PDFU and a MHOU estimated from the shift 
between the result found using NNLO and N7LO matching. 

This procedure leads to our final result for intrinsic charm and its 
total uncertainty (shown in Fig. 1, right). The intrinsic charm PDF is 
found to be compatible with zero for x < 0.2: the negative trend seen 
in Fig. 1 with PDFU becomes compatible with zero only on inclusion 
of theoretical uncertainties. However, at larger x, even with theoreti- 
cal uncertainties the intrinsic charm PDF differs from zero by about 
2.50 in the peak region. This result is stable on variations of dataset, 
methodology (in particular the PDF parametrization basis) and values 
of parameters (specifically the charm mass) of the standard model, as 
demonstrated in Supplementary Sections C and D. 
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Fig. 2|IntrinsiccharmandZ + charm production at LHCb. Top left, the 
LHCb measurements of Z-boson production in association with charm-tagged 
jets, RS, at /s =13 TeV, compared with our default prediction, which includes 
anintrinsic charm component, as well as with a variant in which we impose 

the vanishing of the intrinsic charm component. The thicker (thinner) bands 
in the LHCb data indicate the statistical (total) uncertainty, while the theory 
predictions include both PDFU and MHOU. Top right, the correlation 
coefficient between the charm PDF at Q = 100 GeV in NNPDF4.0 and the LHCb 
measurements of R$ for the three y, bins. The dotted horizonal line indicates 


Our determination of intrinsic charm can be compared to theoreti- 
cal expectations. Subsequent to the original intrinsic charm model of 
ref.'(BHPS model), a variety of other models were proposed*** *8 (see 
ref. * for a review). Irrespective of their specific details, most models 
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the maximum possible correlation. Centre, the charm PDF inthe 4FNS (right) 
and the intrinsic (@FNS) charm PDF (left) before and after inclusion of the LHCb 
Z+ charm (c) data. Results are shown for both experimental correlation models 
discussed inthe text. Bottom left, the intrinsic charm PDF before and after 
inclusion of the EMC charm structure function data. Bottom right, the 
statistical significance of the intrinsic charm PDF in our baseline analysis, 
compared tothe results obtained also including the LHCb Z+ charm (with 
uncorrelated systematics) or the EMC structure function data, or both. The 
dotted horizontal line indicates the 3othreshold. 


predict a valence-like structure at large x with a maximum located 
between x ~ 0.2 and x ~ 0.5, and a vanishing intrinsic component for 
x S$ 0.1. In Fig. 1 (right), we compare our result to the original BHPS 
model and to the more recent meson/baryon cloud model of ref. °. 
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As these models predict only the shape of the intrinsic charm dis- 
tribution, but not its overall normalization, we have normalized them 
by requiring that they reproduce the same charm momentum fraction 
as our determination. We find remarkable agreement between the 
shape of our determination and the model predictions. In particular, 
we reproduce the presence and location of the large-x valence-like peak 
structure (with better agreement, of marginal statistical significance, 
with the meson/baryon cloud calculation), and the vanishing of intrin- 
sic charm at small x. The fraction of the proton momentum carried by 
charm quarks that we obtain from our analysis, used in this comparison 
to models, is (0.62 + 0.28)% including PDFU alone (see Supplementary 
Section E for details). However, the uncertainty on inclusion of MHOU 
greatly increases, and we obtain (0.62 + 0.61)%, due tothe contribution 
from the small-x region, x < 0.2, where the MHOU is very large (Fig. 1, 
right). Note that in most previous analyses'® (see Supplementary Sec- 
tion F) intrinsic charm models (such as the BHPS model) are fitted to 
the data, with only the momentum fraction left as a free parameter. 

We emphasize that in our analysis the charm PDF is entirely deter- 
mined by the experimental data included in the PDF determination. 
The data with the most impact on charm are from recently measured 
LHC processes, which are both accurate and precise. As these measure- 
ments are madeat high scales, the corresponding hard cross-sections 
can be reliably computed in QCD perturbation theory. 

Independent evidence for intrinsic charm is provided by the very 
recent LHC beauty (LHCb) measurements of Z-boson production in 
association with charm-tagged jets in the forward region®, which were 
not included in our baseline dataset. This process, when measured in 
terms of the ratio R§ of charm-tagged jets normalized to flavour- 
inclusive jets, is directly sensitive to the charm PDF (ref. *’), and with 
LHCb kinematics also in the kinematic region where the intrinsic com- 
ponentis relevant. Following refs. °”°, we have evaluated R§ at NLO 
(refs. *°*1; see Supplementary Section G for details), both with our 
default PDFs that include intrinsic charm, and also with an independ- 
ent PDF determination in which intrinsic charm is constrained to vanish 
identically, so charm is determined by perturbative matching (see 
Supplementary Section B). 

In Fig. 2 (top left) we compare the LHCb measurements of R§, provided 
inthree bins of the Z-boson rapidity y,, with the theoretical predictions 
based on both our default PDFs and the PDF set in which we impose the 
vanishing of intrinsic charm. In Fig. 2 (top right) we also show the cor- 
relation coefficient between the charm PDF at Q=100 GeV and the 
observable R§, demonstrating how this observable is highly correlated 
to charm ina localized x region that depends on the rapidity bin. It is 
clear that our prediction is in excellent agreement with the LHCb meas- 
urements, and in particular for the highest-rapidity bin, whichis highly 
correlated to the charm PDF in the region of the observed valence peak 
x 0.45, the prediction obtained by imposing the vanishing of intrinsic 
charm undershoots the data at the 3a level. Hence, this measurement 
provides independent direct evidence in support of our result. 

We have also determined the impact of these LHCb Z + charm meas- 
urements on the charm PDF. As the experimental covariance matrix is 
not available, we have considered two limiting scenarios in which the 
total systematic uncertainty is either completely uncorrelated (9,,, = 0) 
or fully correlated (p,,, = 1) between rapidity bins. The charm PDF in 
the 4FNS before and after inclusion of the LHCb data (with either cor- 
relation model), and the intrinsic charm PDF obtained from it, are 
shown in Fig. 2 (centre left and right, respectively). The bands account 
for both PDFU and MHOU. The results show full consistency: inclusion 
of the LHCb ?§ data leaves the intrinsic charm PDF unchanged, while 
moderately reducing the uncertainty on it. 

Inthe past, the main indication for intrinsic charm came from EMC 
data” on deep-inelastic scattering with charm in the final state*. These 
data are relatively imprecise, their accuracy has often been questioned, 
and they were taken at relatively low scales for which radiative correc- 
tions are large. For these reasons, we have not included them in our 
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baseline analysis. However, it is interesting to assess the impact of their 
inclusion. Results are shown in Fig. 2 (bottom left), which shows the 
intrinsic charm PDF before and after inclusion of the EMC data. Asin 
the case of the LHCb data, we find full consistency: unchanged shape 
and a moderate reduction of uncertainties. 

We cansummarize our results through their so-called local statistical 
significance (namely, the size of the intrinsic charm PDF in units of its 
total uncertainty). This is shown in Fig. 2 (bottom right) for our default 
determination of intrinsic charm, as wellas after inclusion of either the 
LHCb Z+ charm or the EMC data, or both. We find a local significance 
for intrinsic charm at the 2.5a level in the region 0.3 sx < 0.6. This is 
increased to about 30 by the inclusion of either the EMC or the LHCb 
data, and above if they are both included. The similarity of the impact 
ofthe EMC and LHCb measurements is especially remarkable in view of 
the fact that they involve very different physical processes and energies. 

In summary, in this work we have presented evidence for intrinsic 
charm quarks in the proton. Our findings close a fundamental open 
question in the understanding of nucleon structure that has been 
hotly debated by particle and nuclear physicists for the past 40 years. 
By carefully disentangling the perturbative component, we obtain 
unambiguous evidence for intrinsic charm, which turns out to bein 
qualitative agreement with the expectations from model calculations. 
Our determination of the charm PDF, driven by indirect constraints 
from the latest high-precision LHC data, is perfectly consistent with 
direct constraints from both EMC charm production data taken 40 
years ago, and very recent Z + charm production data in the forward 
region from LHCb. Combining all data, we find a local significance 
for intrinsic charm in the large-x region just above the 30 level. Our 
results motivate further dedicated studies of intrinsic charm througha 
wide range of nuclear, particle and astroparticle physics experiments, 
such as those accessible at the High-Luminosity LHC (ref. **) and the 
fixed-target programmes of LHCb (ref. **) and A Large Ion Collider 
Experiment (ALICE)*, to the Electron-lIon Collider, AFTER (ref. *”), the 
Forward Physics Facility** and neutrino telescopes”. 
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Methods 


The strategy adopted in this work to determine the intrinsic charm con- 
tent of the proton is based on the following observation. The assump- 
tion that there is no intrinsic charm amounts to the assumption that 
all 4FNS PDFs are determined” using perturbative matching condi- 
tions” in terms of 3FNS PDFs that do not include a charm PDF. However, 
these perturbative matching conditions are actually given by asquare 
matrix that also includes a 3FNS charm PDF. Thus, the assumption of 
no intrinsic charm amounts to the assumption that if the 4FNS PDFs 
are transformed back to the 3FNS, the 3FNS charm PDF is found to 
vanish. Hence, intrinsic charm is by definition the deviation from zero 
of the 3FNS charm PDF (ref. 4). Note that whereas the 3FNS charm PDF 
is purely intrinsic, the 4FNS charm PDF includes both an intrinsic anda 
perturbative radiative component, but the 4FNS intrinsic componentis 
not equal to the 3FNS charm PDF, because matching conditions reshuf- 
fle all PDFs among each other. 

Intrinsic charm can then be determined through the following two 
steps, summarized in Extended Data Fig. 1. First, all of the PDFs, includ- 
ing the charm PDF, are parametrized in the 4FNS at an input scale Q, 
and evolved using NNLO perturbative QCD to Q # Q,. These evolved 
PDFs can be used to compute physical cross-sections, also at NNLO, 
whichthen are compared toa global dataset of experimental measure- 
ments. The result of this first step in our procedure is a Monte Carlo 
representation of the probability distribution for the 4FNS PDFs at the 
input parametrization scale Qo. 

Next, this 4FNS charm PDF is transformed to the 3FNS at some match- 
ing scale Q.. Note that the choice of both Q, and Q. are immaterial—the 
former because perturbative evolution is invertible, so results for the 
PDFs donot depend onthe choice of parametrization scale Q,; the latter 
because the 3FNS charmis scale independent, so it does not depend on 
the value of Q.. Both statements of course hold up to fixed perturba- 
tive accuracy, and are violated by missing higher order corrections. In 
practice, we parametrize PDFs at the scale Q, = 1.65 GeV and perform the 
inversion at a scale chosen equal to the charm mass Q. = m, = 1.51 GeV. 

The scale-independent 3FNS charm PDF is then the sought-for intrin- 
sic charm. 


Global QCD analysis 

The 4FNS charm PDF and its associated uncertainties is determined 
by means of a global QCD analysis within the NNPDF4.0 framework. 
All PDFs, including the charm PDF, are parametrized at Q, = 1.65 GeV 
inamodel-independent manner using a neural network, whichis fitted 
to data using supervised machine learning techniques. The Monte 
Carlo replica method is deployed to ensure a faithful uncertainty esti- 
mate. Specifically, we express the 4FNS total charm PDF (c*=c+C) in 
terms of the output neurons associated with the quark singlet 2 and 
non-singlet 7,, distributions (see Section 3.1 of ref. *), as 


xC"(x, Qo 38) = (x*2(1- x)FENNy(x, 8) — x 7i5(1 — x)FTISNNr, (x, 8))/4, (1) 


in which NN, 9) is the ith output neuron of a neural network with 
inputx and parameters 8, and («,, 8.) are preprocessing exponents. A 
crucial feature of equation (1) is that no ad hoc specific model assump- 
tions are used: the shape and size of xc’(x, Qo) are entirely determined 
from experimental data. Hence, our determination of the 4FNS fitted 
charm PDF, and thus of the intrinsic charm, is unbiased. 

The neural network parameters 8 in equation (1) are determined by 
fitting an extensive global dataset that consists of 4,618 cross-sections 
froma wide range of different processes, measured over the yearsina 
variety of fixed-target and collider experiments (see ref. >for a complete 
list). Extended Data Fig. 2 shows the kinematic coverage in the (x, Q) 
plane covered by these cross-sections, in which Qis the scale, and xis 
the parton momentum fraction that corresponds to leading-order kin- 
ematics. Many of these processes provide direct or indirect sensitivity 


tothe charm content of the proton. Particularly important constraints 
come from Wand Z production from ATLAS, CMS and LHCbas well as 
from neutral and charged current deep-inelastic scattering structure 
functions from HERA. The 4FNS PDFs at the input scale Q, are related to 
experimental measurements at Q # Q, by means of NNLO QCD calcula- 
tions, including the FONLL-C general-mass scheme for deep-inelastic 
scattering” generalized to allow for fitted charm’. 

We have verified (see Supplementary Section C and Section D) that 
the determination of 4FNS charm PDF equation (1) and the ensuing 
3FNS intrinsic charm PDF are stable on variations of methodology 
(PDF parametrization basis), input dataset and values of parameters 
(the charm mass) of the standard model. We have also studied the sta- 
bility of our results on replacing the current NNPDF4.0 methodology’ 
with the previous NNPDF3.1 methodology”. The results are perfectly 
consistent. Indeed, the old methodology leads to larger uncertain- 
ties, corresponding to a moderate reduction of the local statistical 
significance for intrinsic charm, and to a central value that is within 
the smaller error band of our current result. 

A determination in which the vanishing of intrinsic charm is imposed 
has also been performed. In this case, the fit quality markedly deterio- 
rates: the values of the x’ per data point of 1.162, 1.26 and 1.22 for total, 
Drell-Yan and neutral-current deep-inelastic scattering data, respec- 
tively, found when fitting charm, are increased to 1.198, 1.31 and 1.28 
when the vanishing of intrinsic charm is imposed. The absolute wors- 
ening of the total x’ when the vanishing of intrinsic charm is imposed 
is therefore of 166 units, corresponding to a 2o effect in units of 


O,2=. [Naat 


Calculation of the 3FNS charm PDF 

The Monte Carlo representation of the probability distribution asso- 
ciated with the 4FNS charm PDF determined by the global analysis 
contains an intrinsic component mixed with a perturbatively gener- 
ated contribution, with the latter becoming larger inthex < 0.1region 
as the scale Q is increased. To extract the intrinsic component, we 
transform PDFs to the 3FNS at the scale Q, = m, = 1.51 GeV using EKO, 
a Python open-source PDF evolution framework (see Supplementary 
Section A). Inits current implementation, EKO performs QCD evolu- 
tion of PDFs to any scale up to NNLO. For the sake of the current anal- 
ysis, N?7LO matching conditions have also beenimplemented, by using 
the results of refs. ?** for O(a2) operator matrix elements so that the 
direct and inverse transformations from the 3FNS to the 4FNS can be 
performed at one order higher. The N°LO contributions to the match- 
ing conditions area subset of the full N°7LO terms that would be required 
to performa PDF determination to one higher perturbative order, and 
would also include N°LO contributions to QCD evolution that are 
unknown at present. Therefore, our results have NNLO accuracy, and 
we can use the N°LO contributions to the O(a2)corrections tothe heavy 
quark matching conditions only as a way to estimate the size of the 
missing higher orders. Indeed, these corrections havea very significant 
impact onthe perturbatively generated component (see Supplemen- 
tary Section B). They become large for x < 0.1, which coincides with 
the region dominated by the perturbative component of the charm 
PDF, and are relatively small for the valence region where intrinsic 
charm dominates. 


Z production in association with charm-tagged jets 

The production of Z bosons in association with charm-tagged jets (or 
alternatively, with identified D mesons) at the LHC is directly sensitive 
to the charm content of the proton through the dominant gc > Zc 
partonic scattering process. Measurements of this process at the for- 
ward rapidities covered by the LHCb acceptance provide access to the 
large-x region where the intrinsic contribution is expected to dominate. 
This isin contrast with the corresponding measurements from ATLAS 
and CMS, which become sensitive to intrinsic charm only at rather 
larger values of pt than those accessible experimentally at present. 


We have obtained theoretical predictions for Z+ charm produc- 
tion at LHCb with NNPDF4.0, based on NLO QCD calculations using 
POWHEG-BOX interfaced to Pythia8 with the Monash 2013 tune for 
showering, hadronization and underlying event. Acceptance require- 
ments and event selection follow the LHCb analysis, in which, in par- 
ticular, charm jets are defined as those anti-k,R = 0.5 jets containing a 
reconstructed charmed hadron. The ratio between charm-tagged and 
untagged Z + jet events can then be compared with the LHCb measure- 
ments 


NM(c-tagged jets; y,) _ o(pp > Z+ charm jet; y,) 
NGets; y,) o(pp > Z+ jet; y,) 


R&(y,) = > @) 


as a function of the Z-boson rapidity y, (see Supplementary Section 
G for details). The more forward the rapidity y,, the higher the values 
of the charm momentum x being probed. Furthermore, we have also 
included the LHCb measurements in the global PDF determination by 
means of Bayesian reweighting (see Supplementary Section G). 


Data availability 


Data are available from the corresponding author upon reasonable 
request. The experimental data used to perform the NNPDF4.0 analysis 
are available from https://docs.nnpdf.science/. 


Code availability 


The analysis presented in this work has been carried out using two 
open-source software frameworks, NNPDF for the global PDF deter- 
mination and EKO for the calculation of the 3FNS charm. These codes 
are publicly available from https://docs.nnpdf.science/ and https:// 
eko.readthedocs.io/, respectively. Both the LHAPDF grids produced 


inthis work and the version of EKO with the respective run cards used 
are available from http://nnpdf.mi.infn.it/nnpdf4-0-charm-study/. 
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4FNS CHARM PDF CONSTRAINED BY EXPERIMENTAL DATA FOR Q > Qo 
e NNPDF4.0 dataset e NNLO QCD calculations 
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e Deep-learning parametrisation e Monte Carlo representation of uncertainties 


QCD evolution 
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e Scale-independent e PDF and MHO uncertainties 


Extended Data Fig. 1| Evaluation of the charm PDF inthe 3FNS. The 4FNS charm PDF is parametrized at Q, and evolved to all Q, where it is constrained by the 
NNPDF4.0 global dataset. Subsequently, it is transformed to the 3FNS where (if nonzero) it provides the intrinsic charm component. 
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Extended Data Fig. 2 | Kinematic coverage of the NNPDF4.0 determination. cross-sections have been classified into the main different types of processes 
The kinematic coverage in the (x, Q) plane covered by the 4,618 cross-sections entering the global analysis. 
used for the determination of the charm PDF in the present work. These 


